Welcome to Cluster Simulator’s documentation!

Cluster Simulator is a Python package for simulating the execution of an HPC application on a cluster with heterogeneous storage tiers. It also supports the use of high-performance ephemeral tiers as burst buffers.

Note

This project is funded by the IO-SEA European project.

Installation

Latest release

To install Cluster Simulator, clone the repository with git, then run the following command in the repository directory:

$ pip install -e cluster_simulator

Quickstart

Cluster Simulator is part of the Recommendation System called Execution Simulator. The big picture of the Recommendation System is shown below.

[Figure: Recommendation System diagram (recommendation_system_diagram.png)]

A package that simulates I/O application behavior in a multi-tiered HPC system.

Features

  • Encodes a simple representation of an application as a sequence of I/O-dominant and compute-dominant phases.

  • Storage tiers have their own performance and capacity, which are resources shared between phases and applications.

  • Supports ephemeral tiers that last only for the duration of the application runtime.

Examples

Simple sequential application

initial imports
import simpy
import time
from cluster_simulator.cluster import Cluster, Tier, bandwidth_share_model, compute_share_model, get_tier, convert_size
from cluster_simulator.phase import DelayPhase, ComputePhase, IOPhase
from cluster_simulator.application import Application
from cluster_simulator.analytics import display_run
application formalism
# preparing execution environment variables
env = simpy.Environment()
data = simpy.Store(env)
# app1 : read 3GB -> compute for 15 seconds -> write 5GB
app1 = Application( env, name="app1", # name of the app in the display
                    compute=[0, 15],  # two events, at t=0 and t=15, with a compute phase between them
                    read=[3e9, 0],    # read 3GB at the first event (before compute), nothing at the second
                    write=[0, 5e9],   # write nothing at the first event, 5GB at the second (after compute)
                    data=data)        # collected data for monitoring
preparing the cluster compute and storage tiers facilities
ssd_bandwidth =   {'read':  {'seq': 210, 'rand': 190}, # throughput for read ops in MB/s
                   'write': {'seq': 100, 'rand': 100}} # for read/write random/sequential I/O

# register the tier with a name and a capacity
ssd_tier = Tier(env, 'SSD', bandwidth=ssd_bandwidth, capacity=200e9)
hdd_bandwidth = {'read':  {'seq': 80, 'rand': 80},
                 'write': {'seq': 40, 'rand': 40}}

# register the tier with a name and a capacity
hdd_tier = Tier(env, 'HDD', bandwidth=hdd_bandwidth, capacity=1e12)

# register the cluster by completing the compute characteristics
cluster = Cluster(env, compute_nodes=3,     # number of physical nodes
                       cores_per_node=2,    # available cores per node
                       tiers=[hdd_tier, ssd_tier]) # associate storage tiers to the cluster
running the application and getting the traces
# the placement list indicates on which tier each I/O phase runs
env.process(app1.run(cluster, placement=[0, 1])) # run the first I/O on tier 0 (HDD), the second on tier 1 (SSD)
env.run()

We get the following (interactive) timeseries plot. The application lasts 102.5 seconds. The first read I/O conveys 3GB of data from the HDD tier at a rate of 80MB/s, for a duration of 37.5 seconds. It is followed by a compute-dominant phase of 15 seconds. Finally, the write I/O phase transfers 5GB to the SSD tier in 50 seconds at a rate of 100MB/s.
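These figures can be checked by hand. A minimal sketch of the arithmetic (phase duration = volume / bandwidth), independent of the simulator:

```python
# Back-of-the-envelope check of the phase durations, using the volumes
# and bandwidths from the example (1 MB = 1e6 bytes here).
read_duration = 3e9 / (80 * 1e6)     # 3 GB from HDD at 80 MB/s
compute_duration = 15                # compute phase between the two events
write_duration = 5e9 / (100 * 1e6)   # 5 GB to SSD at 100 MB/s
total = read_duration + compute_duration + write_duration
print(read_duration, write_duration, total)  # 37.5 50.0 102.5
```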

Two parallel applications

Now we define two similar applications that will run on the same cluster and place their I/O operations on the same SSD tier.

running concurrent applications on the same cluster
app1 = Application(env, name="app1", read=[4e9, 0], compute=[0, 15],  write=[0, 10e9],
               data=data)
app2 = Application(env, name="app2", read=[7e9, 0], compute=[0, 10],  write=[0, 3e9],
               data=data)

env.process(app1.run(cluster, placement=[1, 1])) # both I/O are placed in SSD
env.process(app2.run(cluster, placement=[1, 1])) # app2 I/O are also in SSD
env.run()

The two apps equally share the available 210MB/s bandwidth for reading from the SSD. Once app1 finishes its read I/O at t = 38.09 seconds, it frees its bandwidth share for app2's read, whose throughput therefore reaches 210MB/s between 38.09 and 48.09 seconds. After this interval, the write I/O of app1 starts while app2's read is still not finished, so they once again share the available bandwidth.
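The crossover time quoted above follows from simple share arithmetic. A sketch assuming the read bandwidth is split equally between the two concurrent reads:

```python
# Equal-share sketch for two concurrent reads on the SSD tier.
total_bw = 210e6               # total read bandwidth in bytes/s
share = total_bw / 2           # each app gets 105 MB/s while both read
t_app1_read_end = 4e9 / share  # app1 finishes its 4 GB read first
print(round(t_app1_read_end, 2))  # 38.1, matching the ~38.09 s above
```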

  • TODO

Usage

To use Cluster Simulator in a project:

import cluster_simulator

cluster_simulator

cluster_simulator package

Submodules

cluster_simulator.application module

This module proposes a class to define HPC applications as a sequence of phases with compute-, read-, or write-dominant behavior.

class cluster_simulator.application.Application(env, name=None, compute=None, read=None, write=None, data=None, delay=0)

Bases: object

Defining an application as a sequential set of phases (compute/read/write) that occur in a linear fashion: the next phase cannot start until the previous one has finished.

Each phase has its own attributes and dedicated resources, but all phases need compute units (called cores here) in order to run.

Special attention is given to phase compute units: when phases request different numbers of cores, the application waits for the availability of the maximum requested number. Once available within the cluster, this maximum number of cores is locked until the application execution is finished.

env

the environment object where all the discrete event simulation occurs.

name

a string name can be given to the application; it is visible in plotting utilities and logging traces. If not provided, a random string is assigned to the application.

compute

a list of any size containing the times at which events occur; compute phases take place between consecutive elements of the list.

read

a list of the same length as compute. For each timestamp indicated in the compute list, the read list indicates the volume of data in bytes read during this phase. The duration of the read phase depends on the cluster hardware.

write

a list of the same length as compute. For each time indicated in the compute list, the write list contains the volume of bytes that should be written by the application at this timed event. The duration of the write phase depends on the cluster hardware.

data

(optional) a simpy.Store object that stores the phase schedule of the application and makes it available outside of the application.

delay

time in seconds to wait in order to start the application.

Applications running on the same cluster are stackable, as are their respective phases, and can thus run in parallel. A start delay can be applied to any application to postpone its scheduling.

get_fitness()

Method to get the execution duration of the application. It iterates over the records saved in data to find the phase with the latest timestamp.

Record example: sample_item = {'app': 'B8', 'type': 'read', 'cpu_usage': 1, 't_start': 0, 't_end': 4.761904761904762, 'bandwidth': 210.0, 'phase_duration': 4.761904761904762, 'volume': 1000000000.0, 'tiers': ['SSD', 'NVRAM'], 'data_placement': {'placement': 'SSD'}, 'tier_level': {'SSD': 1000000000.0, 'NVRAM': 0}}

Returns

the timestamp of the last event of the session.

Return type

float
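The record scan described above can be sketched as follows (last_event_timestamp and the trimmed records are hypothetical illustrations of the logic, not the package's code):

```python
# Sketch of the get_fitness logic: scan the phase records queued in the
# data store and return the latest t_end timestamp.
def last_event_timestamp(records):
    return max(item["t_end"] for item in records)

records = [
    {"app": "B8", "type": "read", "t_start": 0, "t_end": 4.76},
    {"app": "B8", "type": "compute", "t_start": 4.76, "t_end": 19.76},
]
print(last_event_timestamp(records))  # 19.76
```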

put_compute(duration, cores=1)

Add a compute phase that requests some core units and lasts a specific duration. It instantiates ComputePhase, queues the object to the store attribute, and appends the needed cores to the cores_request list.

Parameters
  • duration (float) – time in seconds of the compute phase.

  • cores (int) – number of cores

put_delay(duration)

Add a delay phase that waits before starting the application. This phase consumes 0 units of compute resources. It instantiates DelayPhase, queues the object to the store attribute, and appends the needed cores to the cores_request list.

Parameters

duration (float) – time in seconds of the delay phase.

put_io(operation, volume, pattern=1)

Add an I/O phase in read or write mode with a specific volume and pattern. It instantiates IOPhase and queues the object to the store attribute.

Parameters
  • operation (string) – type of I/O operation, “read” or “write”. Cannot schedule a mix of both.

  • volume (float) – volume in bytes of data to be processed by the I/O.

  • pattern (float) – encodes the I/O pattern: 1 for a purely sequential pattern and 0 for a purely random one. Accepts intermediate values such as 0.2, i.e. a mix of 20% sequential and 80% random. Defaults to 1.
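One plausible way to combine the pattern value with the per-pattern bandwidths is a linear mix; this is an assumption for illustration, not necessarily the package's exact formula:

```python
# Hypothetical linear interpolation between sequential and random bandwidth.
def mixed_bandwidth(seq_bw, rand_bw, pattern):
    """pattern=1 -> pure sequential, pattern=0 -> pure random."""
    return pattern * seq_bw + (1 - pattern) * rand_bw

# 20% sequential / 80% random read on the SSD tier of the quickstart
print(mixed_bandwidth(210, 190, 0.2))  # 194.0 MB/s
```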

request_cores(cluster)

Issues a request for compute cores to get a slot as a shared resource. Takes the maximum number of cores requested across the application's phases as the amount to lock for the duration of the application.

Parameters

cluster (Cluster) – accepts an object of type Cluster that contains the compute resources.

Returns

an array of requests of each individual compute unit (core).

run(cluster, placement, use_bb=None)

Launches the execution process of an application on a specified cluster having compute and storage resources with placement indications for each issued I/O. Phases are executed sequentially.

Parameters
  • cluster (Cluster) – a set of compute resources with storage services.

  • tiers (_type_) – _description_

Returns

list of objects that stores scheduled phases of the application.

Return type

data (simpy.Store)

Yields

simpy.Event – relative events.

schedule()

Read the compute/read/write inputs from application attributes and schedule them in a sequential order.

Parameters

status (list of bool) – store the sequential status of each element of the application sequence.

cluster_simulator.cluster module

This module proposes a class to define an HPC cluster as a set of compute resources and storage facilities. Both are shared between applications running on the cluster.

class cluster_simulator.cluster.Cluster(env, compute_nodes=1, cores_per_node=2, tiers=[], ephemeral_tier=None, data=None)

Bases: object

A cluster is a set of compute nodes, each having a fixed number of cores. The storage system is heterogeneous and consists of a set of tiers, each with its own capacity and performance. Generally speaking, these tiers are referred to as persistent storage, which means that data conveyed through tiers is kept persistently and can be retrieved at any time by an application, unless it is explicitly removed, as in data-mover methods. A cluster also contains a specific type of tier that is ephemeral, meaning it does not keep data beyond the execution time of an application. Ephemeral tiers are often backed by datanodes that hold their own compute units as well as high-performance storage hardware to serve as a burst-buffer backend. Burst buffers partition their resources into flavors to dispatch them smartly between applications. They also have their own eviction policy for when storage is (nearly) saturated, as well as a destaging capability to move data to a persistent storage tier. As a consequence, each ephemeral tier has a specific persistent tier attached to it.

get_levels()

Gathers a snapshot of the tier levels at a specific time event into a dict.

Returns

snapshot of the tiers levels.

Return type

levels (dict)

get_max_bandwidth(tier, cores=1, operation='read', pattern=1)

Get the maximum bandwidth for a given tier, number of cores dedicated to the operation, and type of operation. A sequential pattern and a large blocksize are assumed during copy/move operations.

Parameters
  • tier (Tier or index) – the tier from which the bandwidth will be estimated.

  • cores (int, optional) – _description_. Defaults to 1.

  • operation (str, optional) – _description_. Defaults to ‘read’.

  • pattern (int, optional) – _description_. Defaults to 1.

Returns

a bandwidth value in MB/s for the specified arguments.

Return type

float

class cluster_simulator.cluster.EphemeralTier(env, name, persistent_tier, bandwidth, capacity=80000000000.0)

Bases: cluster_simulator.cluster.Tier

Ephemeral tiers are tiers that are used for the duration of an application or a workflow. They are attached to a persistent tier where data will reside at the end of the application or workflow. When the app should read data from a tier 1 and makes use of a transient tier, this data can be prefetched to the transient tier to be accessed later by the app from the ephemeral tier. When the app should write data to a target tier 2 and makes use of a transient tier t, the data is first written to tier t and then destaged to tier 2 when the destaging policy is triggered.

Parameters

persistent_tier (Tier) – persistent tier attached to this transient/ephemeral tier where all data conveyed by the application will be found.

evict()

Check if the application should evict some data from the ephemeral tier.

Parameters
  • ephemeral_tier (EphemeralTier) – the ephemeral tier where data is stored.

  • lower_threshold (float) – lower threshold for the eviction.

  • upper_threshold (float) – upper threshold for the eviction.

Returns

amount of data to be evicted.

Return type

eviction_volume (int)

class cluster_simulator.cluster.Tier(env, name, bandwidth, capacity=100000000000.0)

Bases: object

Model a tier storage service with a focus on a limited bandwidth resource as well as a limited capacity. In this model we expect a bandwidth value at its asymptotic state, so blocksize is not a parameter: only the asymptotic part of the throughput curve is considered. The other variables considered are read/write and sequential/random. The output is a scalar value in MB/s. Typically we access the bandwidth value as in a dictionary: b['read']['seq'] = 200MB/s. TODO: extend this to a NN as a function approximator to allow:

averaging over variables; interpolation when a data entry is absent, i.e. so that b['seq'] gives a value

cluster_simulator.cluster.bandwidth_share_model(n_threads)

Description of a bandwidth-share model that could be extended with measurements from storage services.

Parameters

n_threads (int) – number of threads/processes processing I/O simultaneously.

Returns

the bandwidth share of the last process.

Return type

float
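As a minimal placeholder consistent with this description (the measured model shipped with the package may differ), an equal split gives the last process 1/n of the bandwidth:

```python
# Hypothetical equal-share model: each of n threads gets 1/n of the bandwidth.
def bandwidth_share_model(n_threads):
    return 1.0 / n_threads

print(bandwidth_share_model(4))  # 0.25
```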

cluster_simulator.cluster.compute_share_model(n_cores)

Description of parallelizing compute resources for an application. The gain factor is considered for a reference duration when using a single unit of computing.

Parameters

n_cores (int) – number of cores (computing unit) the application is distributed on.

Returns

the speedup factor in comparison when using a single compute unit.

Return type

float
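A plausible shape for such a speedup factor is Amdahl's law with a parallel fraction p; this is an illustrative assumption, not the package's actual model:

```python
# Hypothetical speedup model: Amdahl's law with parallel fraction p.
def compute_share_model(n_cores, p=0.9):
    return 1.0 / ((1 - p) + p / n_cores)

print(compute_share_model(1))            # 1.0 (single core: no speedup)
print(round(compute_share_model(4), 2))  # 3.08
```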

cluster_simulator.cluster.convert_size(size_bytes)

Function to display a data volume in a human-readable way (B, KB, MB, …) instead of raw byte counts such as 1e3, 1e6, or 1e9.

Parameters

size_bytes (float) – volume of data in bytes to convert.

Returns

the volume expressed in a more convenient unit.

Return type

string
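A sketch of such a formatter, using decimal (powers of 1000) units as the description suggests; the exact rounding and formatting of the package's convert_size may differ:

```python
import math

# Sketch of a human-readable size formatter using decimal units.
def convert_size(size_bytes):
    if size_bytes == 0:
        return "0B"
    units = ("B", "KB", "MB", "GB", "TB", "PB")
    i = min(int(math.log(size_bytes, 1000)), len(units) - 1)
    return f"{size_bytes / 1000 ** i:g}{units[i]}"

print(convert_size(3e9), convert_size(1500))  # 3GB 1.5KB
```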

cluster_simulator.cluster.get_tier(cluster, tier_reference, use_bb=False)

Cluster has attributes called tiers and ephemeral_tier. The first one is a list and the second one is a single object attached to one of the (persistent) tiers. A reference to a tier can be either a string (search by name) or an integer (search by index in the list of tiers). When a placement refers to a Tier object and use_bb is False, data is placed in the tier. When use_bb is True, data is placed in the ephemeral_tier attached to the indicated tier.

Parameters
  • cluster (Cluster) – a cluster object that contains the tiers

  • tier_reference (string, int) – tier name or index in the list of tiers

  • use_bb (bool) – if True, data will be placed in the ephemeral_tier which is attached to the indicated tier.

Returns

The storage tier that will be targeted for I/O operations.

Return type

Tier or EphemeralTier
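The name-or-index resolution can be sketched as follows (resolve_tier and FakeTier are hypothetical stand-ins; the package's get_tier additionally handles use_bb and the ephemeral tier):

```python
# Sketch of tier resolution by name (string) or by index (int).
def resolve_tier(tiers, tier_reference):
    if isinstance(tier_reference, int):
        return tiers[tier_reference]
    for tier in tiers:
        if tier.name == tier_reference:
            return tier
    raise ValueError(f"unknown tier: {tier_reference!r}")

class FakeTier:
    def __init__(self, name):
        self.name = name

tiers = [FakeTier("HDD"), FakeTier("SSD")]
print(resolve_tier(tiers, "SSD").name, resolve_tier(tiers, 0).name)  # SSD HDD
```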

cluster_simulator.cluster.monitor_step(data, lst)

Monitoring function that feeds a queue of records on phase events while an application is running on the cluster.

Parameters
  • data (simpy.Store) – a store object that queues elements of information useful for logging and analytics.

  • lst (dict) – information element to add to the data store.

cluster_simulator.phase module

This module proposes a class for each type of phase that composes an application. While real applications rarely exhibit pure phases but rather dominant ones, we consider it possible, without loss of generality, to describe an application as a combination of pure compute, read, or write phases.

class cluster_simulator.phase.ComputePhase(duration, cores=1, data=None, appname=None)

Bases: object

Defining an application phase of type compute, which consists only of a dominant/intensive compute operation of a certain duration involving some compute units.

duration

duration in seconds of the computing period.

cores

number of compute units involved or consumed during the operation.

data

data store object where application records are kept.

appname

a user specified application name the phase belongs to.

run(env, cluster)

Executes the compute phase.

Parameters
  • env (simpy.Environment) – environment object where all simulation takes place.

  • cluster (Cluster) – the cluster on which the phase will run.

Returns

True if the execution succeeds, False if not.

Return type

bool

Yields

event – yields a timeout event.

class cluster_simulator.phase.DelayPhase(duration, data=None, appname=None)

Bases: object

Defining an application phase of type delay, which consists only of waiting for a duration before starting the next phase.

duration

duration in seconds of the waiting period.

data

data store object where application records are kept.

appname

a user specified application name the phase belongs to.

run(env, cluster)

Executes the delay phase by running a simple timeout event.

Parameters
  • env (simpy.Environment) – environment object where all simulation takes place.

  • cluster (Cluster) – the cluster on which the phase will run.

Returns

True if the execution succeeds, False if not.

Return type

bool

Yields

event – yields a timeout event.

class cluster_simulator.phase.IOPhase(cores=1, operation='read', volume=1000000000.0, pattern=1, data=None, appname=None)

Bases: object

Defining an application I/O phase, which consists of predominantly processing inputs/outputs during the application runtime. Variables:

current_ios: list of IOPhase instances in running state, so that it is possible to update I/Os following a change in bandwidth consumption.

cores

number of compute units involved or consumed during the operation.

operation

specify if it is “read” operation (input data) or “write” operation (output data).

volume

volume in bytes of the data to be operated during the phase.

pattern

describes the pattern encoding with 1 if pure sequential, 0 if pure random, and a float in between.

last_event

float to keep the timestamp of the last I/O event.

next_event

float to keep the timestamp of the next I/O event.

bandwidth_concurrency

int to indicate how many processes are doing I/O.

dirty

int to indicate the amount of data that is dirty in ephemeral tier/RAM (does not have a copy in persistent tier).

data

data store object where application records are kept.

appname

a user specified application name the phase belongs to.

current_ios = []
get_move_duration(cluster, source_tier, target_tier, volume)

Get the adequate step duration to avoid I/O event or volume saturation in tier

Parameters
  • cluster (Cluster) – _description_

  • tier (Tier) – _description_

  • volume (float) – _description_

Returns

_description_

Return type

tuple

get_step_duration(cluster, tier, volume)

Get the adequate step duration to avoid I/O event or volume saturation in the tier

Parameters
  • cluster (Cluster) – _description_

  • tier (Tier) – _description_

  • volume (float) – _description_

Returns

_description_

Return type

tuple

move_step(env, cluster, source_tier, target_tier, erase=False)

Runs a movement of data over an interval during which the available bandwidth is constant.

Parameters
  • env (simpy.Environment()) – environment variable where the I/O operation will take place.

  • cluster (Cluster) – the cluster on which the application will run.

  • source_tier (Tier) – storage tier from which we read data to move.

  • target_tier (Tier) – storage tier where the data will be moved.

Returns

True if the step is completed, False otherwise.

Return type

bool

Yields

simpy.Event – other events that can occur during the I/O operation.

move_volume(step_duration, volume, available_bandwidth, cluster, source_tier, target_tier, erase=False, initial_levels=None)

This method moves a small amount of I/O volume between two predictable events, from a source_tier to a target_tier, at the available bandwidth value. If an event occurs in the meantime, the data movement is interrupted and the bandwidth updated accordingly.

Parameters
  • step_duration (float) – the expected duration between two predictable events.

  • volume (float) – volume in bytes of the data to move.

  • cluster (Cluster) – cluster facility where the I/O operation should take place.

  • available_bandwidth (float) – available bandwidth in the step.

  • source_tier (Tier) – storage tier from which we read data to move.

  • target_tier (Tier) – storage tier where the data will be moved.

process_volume(step_duration, volume, available_bandwidth, cluster, tier, initial_levels=None)

This method processes a small amount of I/O volume between two predictable events on a specific tier. If an event occurs in the meantime, the I/O is interrupted and the bandwidth updated accordingly.

Parameters
  • step_duration (float) – the expected duration between two predictable events.

  • volume (float) – volume in bytes of the data to process.

  • cluster (Cluster) – cluster facility where the I/O operation should take place.

  • available_bandwidth (float) – available bandwidth in the step.

  • tier (Tier) – storage tier concerned by the I/O operation. It could be reading from this tier or writing to it.

register_step(t_start, step_duration, available_bandwidth, cluster, tier, initial_levels=None, source_tier=None, eviction=None)

Registers a processing step in the data store, with some logging.

Parameters
  • t_start (float) – timestamp of the start of the step.

  • step_duration (float) – duration of the step.

  • available_bandwidth (float) – available bandwidth in the step.

  • cluster (Cluster) – the cluster on which the phase will run.

  • tier (Tier) – the tier on which the step will run.

  • initial_levels (dict) – initial levels of all tiers at the start of the step.

  • source_tier (Tier, optional) – the tier from which the step will run.

  • eviction (int, optional) – volume of data which was evicted from ephemeral tier.

run(env, cluster, placement, use_bb=False, delay=0)
run_step(env, cluster, tier)

Runs a step of an I/O operation during which the bandwidth share is constant.

Parameters
  • env (simpy.Environment()) – environment variable where the I/O operation will take place.

  • cluster (Cluster) – the cluster on which the application will run.

  • tier (Tier) – the storage tier where the I/O will be processed.

Returns

True if the step is completed, False otherwise.

Return type

bool

Yields

simpy.Event – other events that can occur during the I/O operation.

update_tier(tier, volume)

Update the tier level with the algebraic value of volume.

Parameters
  • tier (Tier) – tier for which the level will be updated.

  • volume (float) – volume value (positive or negative) to adjust tier level.

update_tier_on_move(source_tier, target_tier, volume, erase)

Update tier level following a volume move.

Parameters
  • source_tier (Tier) – tier from which the data will be moved.

  • target_tier (Tier) – tier for which the level will be updated.

  • volume (float) – volume value (positive or negative) to adjust tier level.

  • erase (bool) – whether or not to erase the moved volume from source_tier.

cluster_simulator.phase.monitor_step(data, lst)

Monitoring function that feeds a queue of records on phase events while an application is running on the cluster.

Parameters
  • data (simpy.Store) – a store object that queues elements of information useful for logging and analytics.

  • lst (dict) – information element to add to the data store.

cluster_simulator.utils module

This module contains mainly utility functions that are used through other modules of the cluster simulator package.

class cluster_simulator.utils.BandwidthResource(event_list, *args, **kwargs)

Bases: simpy.resources.resource.Resource

Subclasses simpy's Resource to introduce the ability to run check_bandwidth when the resource is requested or released.

check_bandwidth()

Checks running I/Os when the bandwidth occupation changes. I/Os should be interrupted on release or request of a bandwidth slot.

release(*args, **kwargs)

On release, runs check_bandwidth after calling the parent release method.

request(*args, **kwargs)

On request, runs check_bandwidth after calling the parent request method.

cluster_simulator.utils.compute_share_model(n_cores)

Description of parallelizing compute resources for an application. The gain factor is considered for a reference duration when using a single unit of computing.

Parameters

n_cores (int) – number of cores (computing unit) the application is distributed on.

Returns

the speedup factor in comparison when using a single compute unit.

Return type

float

cluster_simulator.utils.convert_size(size_bytes)

Function to display a data volume in a human-readable way (B, KB, MB, …) instead of raw byte counts such as 1e3, 1e6, or 1e9.

Parameters

size_bytes (float) – volume of data in bytes to convert.

Returns

the volume expressed in a more convenient unit.

Return type

string

cluster_simulator.utils.get_tier(cluster, tier_reference, use_bb=False)

Cluster has attributes called tiers and ephemeral_tier. The first one is a list and the second one is a single object attached to one of the (persistent) tiers. A reference to a tier can be either a string (search by name) or an integer (search by index in the list of tiers). When a placement refers to a Tier object and use_bb is False, data is placed in the tier. When use_bb is True, data is placed in the ephemeral_tier attached to the indicated tier.

Parameters
  • cluster (Cluster) – a cluster object that contains the tiers

  • tier_reference (string, int) – tier name or index in the list of tiers

  • use_bb (bool) – if True, data will be placed in the ephemeral_tier which is attached to the indicated tier.

Returns

The storage tier that will be targeted for I/O operations.

Return type

Tier or EphemeralTier

cluster_simulator.utils.monitor_step(data, lst)

Monitoring function that feeds a queue of records on phase events while an application is running on the cluster.

Parameters
  • data (simpy.Store) – a store object that queues elements of information useful for logging and analytics.

  • lst (dict) – information element to add to the data store.

cluster_simulator.utils.name_app(number_of_letter=1, number_of_digits=1)

Gives a random but reproducible string that should be unique enough for naming phases and applications.

Returns

string concatenating uppercase letters and digits to be easily identifiable.

Return type

string
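A sketch of such a generator; seeding the RNG is what makes the name reproducible (the seed parameter here is an assumption about how reproducibility could be achieved, not the package's signature):

```python
import random
import string

# Hypothetical reproducible name generator: uppercase letters then digits.
def name_app(number_of_letter=1, number_of_digits=1, seed=0):
    rng = random.Random(seed)
    letters = "".join(rng.choice(string.ascii_uppercase)
                      for _ in range(number_of_letter))
    digits = "".join(rng.choice(string.digits)
                     for _ in range(number_of_digits))
    return letters + digits

# Same seed, same name: the result is reproducible across calls.
print(name_app(seed=42) == name_app(seed=42))  # True
```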

Module contents

Top-level package for Cluster Simulator.

Contributing

Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.

You can contribute in many ways:

Types of Contributions

Report Bugs

Report bugs at https://gitlab.jsc.fz-juelich.de/io-sea/wp3.4-analytics/.

If you are reporting a bug, please include:

  • Your operating system name and version.

  • Any details about your local setup that might be helpful in troubleshooting.

  • Detailed steps to reproduce the bug.

Fix Bugs

Look through the GitLab issues for bugs. Anything tagged with “bug” and “help wanted” is open to whoever wants to implement it.

Implement Features

Look through the GitLab issues for features. Anything tagged with “enhancement” and “help wanted” is open to whoever wants to implement it.

Write Documentation

Cluster Simulator could always use more documentation, whether as part of the official Cluster Simulator docs, in docstrings, or even on the web in blog posts, articles, and such.

Submit Feedback

The best way to send feedback is to file an issue at https://gitlab.jsc.fz-juelich.de/io-sea/wp3.4-analytics/cluster_simulator/issues.

If you are proposing a feature:

  • Explain in detail how it would work.

  • Keep the scope as narrow as possible, to make it easier to implement.

  • Remember that this is a volunteer-driven project, and that contributions are welcome :)

Get Started!

Ready to contribute? Here’s how to set up cluster_simulator for local development.

  1. Fork the cluster_simulator repo on GitLab.

  2. Clone your fork locally:

    $ git clone https://gitlab.jsc.fz-juelich.de/io-sea/wp3.4-analytics/cluster_simulator.git
    
  3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development:

    $ mkvirtualenv cluster_simulator
    $ cd cluster_simulator/
    $ python setup.py develop
    
  4. Create a branch for local development:

    $ git checkout -b name-of-your-bugfix-or-feature
    

    Now you can make your changes locally.

  5. When you’re done making changes, check that your changes pass flake8 and the tests, including testing other Python versions with tox:

    $ flake8 cluster_simulator tests
    $ python setup.py test or pytest
    $ tox
    

    To get flake8 and tox, just pip install them into your virtualenv.

  6. Commit your changes and push your branch to GitLab:

    $ git add .
    $ git commit -m "Your detailed description of your changes."
    $ git push origin name-of-your-bugfix-or-feature
    
  7. Submit a pull request through the GitLab website.

Pull Request Guidelines

Before you submit a pull request, check that it meets these guidelines:

  1. The pull request should include tests.

  2. If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with a docstring, and add the feature to the list in README.rst.

  3. The pull request should work for Python 3.5, 3.6, 3.7 and 3.8, and for PyPy. Check https://gitlab.jsc.fz-juelich.de/io-sea/wp3.4-analytics/cluster_simulator/pull_requests and make sure that the tests pass for all supported Python versions.

Tips

To run a subset of tests:

$ python -m unittest tests.test_cluster_simulator

Deploying

A reminder for the maintainers on how to deploy. Make sure all your changes are committed (including an entry in HISTORY.rst). Then run:

$ bump2version patch # possible: major / minor / patch
$ git push
$ git push --tags

Travis will then deploy to PyPI if tests pass.

Credits

Development

Contributors

IO-SEA members are welcome.

History

0.1.0 (2022-06-12)

  • Alpha release
